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ABSTRACT 

Extracting Times of Arrival from pulsar radio signals depends on the knowledge of the pulsars 
pulse profile and how this template is generated. We examine pulsar template generation 
with Bayesian methods. We will contrast the classical generation mechanism of averaging 
intensity profiles with a new approach based on Bayesian inference. We introduce the 
Bayesian measurement model imposed and derive the algorithm to reconstruct a “statistical 
template” out of noisy data. The properties of these “statistical templates” are analysed both 
with simulated and real measurement data from PSR B1133+16. We explain how to put this 
new form of template to use in analysing secondary parameters of interest and give various 
examples; 

We show how to reconstruct a detuned measurement’s phase shifts and demonstrate how 
to discriminate between different modes of radiation by implementing a nulling detection. 
Combining elements of the former, we implement a nonlinear filter for determining ToAs of 
pulsars. Applying this method to data from PSR J1713+0747 we derive ToAs self consis¬ 
tently, meaning all epochs were timed and we used the same epochs for template generation. 
The phase shift reconstruction is found to measure a shift in simulated data up to the estimated 
statistically possible accuracy out of noisy data. 

Average templates as well as Bayesian templates are subject to uncertainties by fluctuations 
and noise. While the average template contains these as unavoidable artifacts, we find that 
the “statistical template” derived by Bayesian inference quantifies fluctuations and remaining 
uncertainty. This is why the algorithm suggested turns out to reconstruct templates of 
statistical significance from as few as ten to fifty single pulses. 

A moving data window of fifty pulses, taking out one single pulse at the beginning and 
adding one at the end of the window unravels the characteristics of the methods to be 
compared. It shows that the change induced in the classical reconstruction is dominated by 
random fluctuations for the average template, while statistically significant changes drive the 
dynamics of the proposed method’s reconstruction. 

The analysis of phase shifts with simulated data reveals that the proposed nonlinear algorithm 
is able to reconstruct correct phase information along with an acceptable estimation of the 
remaining uncertainty. 


Key words: pulsars: general - methods; data analysis - methods; analytical - methods; statis¬ 
tical - methods: numerical - pulsars: individual (B1133+16, J1713+0747) 


1 INTRODUCTION 

Pulsars are well known for their high precision measurements, par- 
ticu larly in areas of fundamental physics s uch as general relativ¬ 
ity dCordes et al.ll2004 iKramer et al]|2006h . These measurements 
make use of the exceptional rotational stability of the pulsar as ob¬ 
served via its electromagnetic pulses, most often observed in the 
radio. Most of these analyses rely on determining the arrival times 
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of the pulses out of tens of thousands integrated single pulses. 

The individual (or single-) pulses observed from a pulsar are ex¬ 
tremely variable, not only in flux but also in the shape of the pulse. 
It is only when 10s of 1000s of pulse s are averaged in an i ntegrated 
pulse profile that it becomes stable dHelfand et al.ll 19751 ; iLiu et al.l 
l2012h . Some pulsars exhibit temporal variations in this pulse pro¬ 
file; showing no emission for a period of time - nulling (see Sec. 
[T5] >. or switching between one or more additional stable profiles 
(see Sec. l3.7t . however for the majority of pulsars the pulse profile 
remains stable, in some cases over at least years). 
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There exist ready to use toolsets like PSRCHIVEiH otan et all 
l2004h and TEMP02 jEdwards et alll2006l: lifobbs et aljboodr to 


analyse raw observation data in order to assign times of arrival to 
integrated observations and constrain the parameters of the physical 
model under consideration. The remaining differences between the 
model’s prediction of an arrival time and the actual time measured, 
also known as timing residuals, can be as low as a few tens or hun¬ 
dreds nanoseconds on average while the periods of observational 
data spans several years. This amounts to tracking the rotational 
phase of the pulsar at a certain observational epoch to an accuracy 
of 10“"^ and well below. In the light of a single pulse being a highly 
variable object that apparently cannot be tracked down to that ac¬ 
curacy (see e.g. fig. EJ it becomes clear that this very exact phase 
information is imprinted in and has to be recovered out of tens of 
thousands of very individual pulses. 

The classical way of generating times of arrival (ToAs) is based on 
techniques comparing the shift of an epoch under consideration to a 
so called template. The template is often an integrated pulse profile 
of the entire dataset for that pulsar (at the appropriate observing 
frequency and bandwidth). Usually it is then fit with an analytic 
function to produce a noiseless template. These templates are oc¬ 
casionally updated and the ToAs reproduced. 

Observations integrated over usually shorter times are then tested 
for a phase shift w.r.t. the template. The user of PSRCHIVE e.g. has 
the choice among five algorithms all comparing the data of an ob¬ 
servation to the reference and yielding a relative shift and an error 
estimate on it for example by discrete cross-correlation an d inter¬ 
polation to yield sub-bin accuracy like iHotan et"^ (l2005h or ex- 
amining the p hase gradient in the fourier space representation as in 
iTavlodm^ to name just a few. The phase shift is then converted 
into a time of arrival using the timestamp and folding period of the 
observation. In a next step, the actual pulsar timing is carried out. 
Pulsar timing is the process by which a rotational model of the pul¬ 
sar is produced (the ephemeris) and the value of its rotation phase 
is mapped to the ToA at the observatory of every pulse. This is then 
fit to the ToAs from observations to produce a phase coherent so¬ 
lution that accounts for every rotation of the pulsar from the first 
observation to the most recent. It is the coherence of this solution 
that gives pulsar timing its extraordinary precision. 

This constitutes the classical procedure of pulsar timing. Both the 
template and the observed profile to be converted to a ToA rely on 
integrated and thus averaged data before correlating. Single pulses 
cannot be compared directly for they are often not bright enough 
to be distinguished from noise introduced on the signal’s way to 
the telescope and by the antenna-receiver-amplifier system. Eur- 
thermore their distinct and individual shape makes it difficult to 
correlate them to the average profile as simply matching the shape 
of the pulse against the average is not possible. Thus timing indi¬ 
vidual pulses by comparing them to a template lacks the necessary 
precision. 

For these reasons and yielding a such precise timing the classical 
way of integrating the single pulses to yield a stable and confident 
profile justifies itself. However in actual observations, the reduced 
value, a measure for the believed precision of the method com¬ 
pared to the actual residua l error of the timing model, is much larger 
than unity (see e.g. lBailesI (l2010h ) pointing to an underestimation of 
the error by the classical procedure and algorithms. This additional 
error is believed to have its root in additional noise sources like e.g. 
scattering by the interstellar medium. A comprehensive overview 
of the noise budgets can be found e.g. in ICordes & ShannonI 1 I 2 OIOI) . 
By assessing and fitting the possibly introduced effects on the 
ToAs, mitigating errors after determining ToAs has been inten¬ 


sively investigated IColes et alj ( l201lh : ICordes & ShannonI ( l2010h . 
As Bayesian methods already have been successfully used for cor¬ 
relating ToA-data from different pulsars, extending these to miti¬ 
gate the timing errors by supposing a Bayesian timing model on 
the ToAs as described by Lee et al.l l l2014h : Ivigeland & Vallisneril 
( l2013lklLentati et all <2014) ^ and others suggests itself. 

The shape of single pulses is affected mainly by the short term 
noise contributions coming from current observational conditions 
one can aptly paraphrase as interstellar weather and the magne¬ 
tosphere or radiation process intrinsic pulse jitter. Both influences 
affect mainly the single pulse statistics or influence the profile over 
short timescales, but can s ystematically influenc e timing and even 
dominate timing precision l lOslowski et al.ll2oTl]) . 

Increasing the gain of the observations with future systems like the 
Square Kilometre Array (SKA) aggravates this problem and de¬ 
mands for longer integration times if no solution is found to incor¬ 
porate the single pulse fluctuations into the p roblem of determ ining 
ToAs rather than trying to average them out iLiu et alj|2012h . 
Finding a generic way of dealing with single pulse indivituality can 
ameliorate the situation and fill the statistical gap between single 
pulses and an integrated profile without the need to take further 
modelling assumption^ 

In this paper we will show a way to address the problem of gen¬ 
erating pulsar ToAs with single pulse statistics and the benefits of 
having a more accurate statistical representation for single pulses’ 
behaviour. By using more information than the classical method 
(which integrates this information) we hope to improve the pre¬ 
cision of pulsar ToA generation and to allow ToAs from shorter 
observations to be unbiased by pulse jitter. 

We introduce a model for statistics of the single pulses and substi¬ 
tute the classical template by a “statistical template” representing 
the single pulse statistics. We will then work out the steps of sta¬ 
tistical template generation and phase shift detection replacing the 
classical counterparts. 

Hereto we will derive a measurement model for squared and am¬ 
plified receiver voltage in Sec.j^and argue how to handle the sig¬ 
nals from pulsars best, splitting them in a radiation process and 
a phase coupled template part. Using the Bayesian formalism, we 
will deduce the posterior distribution for both the amplitude and 
phase model introduced and thus show a way to infere a statistical 
template from input data. Though focusing on the important ap¬ 
plication of generating ToAs, we will also demonstrate the benefit 
of using statistical templates on other applications such as deter¬ 
mining nulling and analysing a moding pulsar. Sec. will evaluate 
the proposed algorithm and methods using both simulated and real 
pulsar data also giving insight on how a statistical template can en¬ 
hance the view on the pulsars profile from a statistical perspective. 
Comparing the generation of ToAs to the classical results we will 
highlight improvements both in timing residuals as well as the 
value of the fit. We summarize our findings in Sec.|4] 


2 MODELLING PULSAR MEASUREMENTS 

2.1 Terminology of the used statistical model for single pulses 

In order to describe single pulse statistics we use the notion of a 
fluctuating envelope besides to the classical set of terms of single 
pulses and integrated profiles. 

^ Incorporating assumptions on e.g. scattering of the ISM or pulse jitter 
behaviour however may further improve the results presented 
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pulsar phase 


Figure 1. Scheme for single pulse statistics as described in Sec. o 


Instead of modeling an average profile, we assume a single pulse 
to be the result of multiplying a stationary Gaussian random noise 
process, which amounts to the physical process that generates the 
radiation in the pulsar magnetosphere, with a possibly unstable en¬ 
velope (see figd)- This envelope models the periodic influence of 
the magnetosphere. The uncertainty in the envelope may reduce 
with observation time when the envelope is not fluctuating. How¬ 
ever if the magnetosphere conditions for some region mapping to 
the pulse phase are fluctuating, the envelope will stay fluctuating in 
that region. 

Only in the limit of no uncertainty the envelope would equal the 
(scaled) average of the pulses and amount to the same thing as a 
classical integrated profile. As averaging single pulses generates a 
pulse profile, we emphasize that there are many possible fluctuat¬ 
ing envelopes which are integrated to the same profile. 

The empiric or analytic template that is used for generating times 
of arrival will consequently be substituted by a Bayesian statistical 
template describing the fluctuating envelope. 


2.2 Amplitude model 

Every Bayesian analysis is based on formulating a signal and noise 
model to describe the statistical imprint of a certain signal in the 
data. Thus, before inserting the outlined statistical model of the 
signal , we have to understand the influence of an arbitrary signal. 
Therefore we will introduce the basic measurement model, insert 
our signal model and then derive the formulae for signal recon¬ 
struction using Bayes’ theorem. 

Typical pulsar measurements at radio frequencies give the intensity 
of received radiation by suitable sampling of d, the quadrature of 
the received voltage. The problem is to infer a unknown signal s 
(which in our case amounts to the noise-free series of single pulses 
as emitted from the pulsar) from the noisy data d. The noise - leav¬ 
ing aside radio interference - is dominated by thermal fluctuations 
of both receiver and antenna and is accurately described by white 
noise of a usually known variance cr^ entering before quadrature of 
the signal. Thus, the baseline corrected data d measured, given a 
signal s and a random noise n amounts to 

d=(n-|-s)^ (1) 

d — -I- 2sn (2) 


where 


P(d|s, n) = (5(d — (n + s)^) (3) 

P(n)=gn(0,N) (4) 

where T(x\y) is the short notation for the probability density of 
X = X given Y = y, where X, Y would be the correspond¬ 
ing random variables. We use boldface lower case letters to em¬ 
phasize that data, signal and noise consist of many single values 
d = {di, d2, ■ ■ ■ dt, ■'' } for which the equations stated are valid 
component-wise. Thus operations like taking a root of a vector or 
multiplying two vectors are carried out only on the components 
of the vector. For the scalar product of vectors a and b we use 
the notations a^b and Vat a = ||a||. We abbreviate det(-) as 
I • I and define (5(a) := S{ak). Putting a hat over a vector 
V = diagmat(v) denotes constructing a diagonal matrix with a 
diagonal with the elements ofv. When dealing with sets, A \ g de¬ 
notes the set A without the elements g. 

f/x(ni, V) denotes that x is distributed as the multivariate Gaussian 
distribution with mean m and covariance matrix V. It is defined as 

ex(tn,V) — —=i=exp[-i(x-m)fV^(x-m)] 

Y/27r|V| ^ 

Integrating over a vector a is denoted by P(b) = 
ffDaP(a,b) = ff dai ■ ■ ■ dakV(a,b). 

The classical way of determining the template intensity amounts 
to averaging eq. © over several pulsar periods, noticing that 
the term mixing signal and noise vanishes leaving us with the 
desired quantity of (s^) when subtracting an from the averaged 
data. To uses Bayes’ theorem, we will instead calculate the exact 
probability distribution to measure d at a certain time, given a 
certain signal value s at that time: 


P(d|s) = 


DnP(n, d|s) = 
Dn5(d — (n -f s)^ 


DnP(d|n, • ■ • )P(n) 
1 


V\^\ 




(5) 


( 6 ) 


We substitute n = (n-f- s)^, n = ±Vn — s, dn = and notice 

that -^INI = , where Ntot is the total number of samples in 

all bins. 




'^Pho 


1 ||x/n± I 


-exp[- 


l||Vd±s|| 


y|27i^(j^‘°‘ ^'2 (t; 

where we now have to assign a signal model for s. We assume 


(7) 

( 8 ) 


s = exp[f] • gan (9) 

Where we have decided for the signal to be measured in units of 
noise temperature of the receiver-antenna system. The ±-operator 
means that the whole density would be a sum of the densities with 
the respective sign. This operator vanishes later on, when integrat¬ 
ing over g and using P{g) = P{—g). f and g are Gaussian random 
variables with also to be determined covariances F and G which 
describe the statistical properties of the envelope respectively the 
radiation process. The exponential form for the envelope was cho¬ 
sen since on one hand it is able to grasp the occurring fluctuations 
to high radiation densities of single pulses and on the other we may 
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easily assume a scale-free prior for the amplitude, as we will see 
below. 

We assume the radiation process to have stationary statistics given 
by 

G = G(f - t') 

and the part that is generating the observed envelope to be peri¬ 
odic in time. Even though the covariance matrix of exp[/] alone 
has non-diagonal components, the product’s covariance matrix 
((/(f)exp[/](f) 5 r'(f)exp[/]'(f)) has only diagonal components if 
we marginalize over g, because of the stationarity of g as assumed 
in (ED. Thus we describe the covariance matrix solely by it’s 
Fourier-coefficients al — k ■ ^. Since the expectation of the 
signal’s covariance matrix for a certain g integrated over a specific 
pulse certainly does not vanish, the non-diagonal components of 
the signal’s matrix carry information about g which we will dis¬ 
card in this paper. T denotes the real pulsar period, which we will 
discriminate from the assumed period r later on. 


2.2 .1 Joint Probability and Information Hamiltonian 

For a moment, let us assume the covariance matrices F and G are 
known. Then we may write for the joint probability 


P(f^g^l^) = <5(exp[f]gan - s)ef(0,F)eg(0, G)P(r) 
— :Au{s} —-.B 

and we define the parameter sets A and B where P(t) is the prob¬ 
ability density over the assumed period r. In order to define the 
Information Hamiltonian Jffl [A, d] := — log P(zl, d|P) we no¬ 
tice that ViA, d|i3) = ff DsP(d|s)P(j4 U {s}|P) and use the 
measurement model given by 


-logPiA,d\B) = Hb[A] = 

(\/d - exp[f] • g(Jn)^^(x/d - exp[f] ■ gcrn)-|- 
f^F-if+ gtG-ig-b] - 

-logP(r) + ilog(|2^a^|'^‘°‘|27rF||2^G||d|) (10) 

Since we are interested in the pulsar’s statistical template given by 
the envelope characteristics we marginalize over the radiation pro¬ 
cess g by integrating it out. This calculation is outlined in Sec. IM] 
We are left with 


1 

2 


Hb{A] 


i x/d^-!j[l-I-exp[f]Gexp[f]] ^v/d-ff^F ^f-b 
Z cr^i 


+ log(|27rDf-i||27ra2|^‘°‘|2^F||27rG||d|)] -logP(r) 

( 11 ) 


where the operator [1 -|- exp[f]Gexp[f]]“^ together with the noise 
variance is a variation o f the operator leading to the well-known 
case of the Wiener filter ( IWieneiiri949l) . This similarity arises due 
to the integration over the Gaussian random field g added upon 
the signal field s. However, the normalisation w.r.t. the data dif¬ 
fers and we further distinguish between the periodic part described 
by (f, F) and the radiation part now showing up only as G. The 
functioning of the operator can be understood by considering its 
interplay with given data. Maximizing probability equals minimiz¬ 
ing the Hamiltonian. Thus high data values will be compensated by 
large mean values of f. Conversely, for too high values of f the exact 


value of the data becomes irrelevant, leading to a sub-optimal so¬ 
lution independent of the data. The optimal solution resembles the 
actual probability distribution the data would be drawn from. As 
a function of d, the joint probability is a special case of a gamma 
distribution oc exp[—1| |]||d||where m is the mean value of 

the distribution. In this light the operator resembles the classical 
formula (d) — m — + since m = crn[l -|- for our case. 

G becomes diagonal in Fourier space since g is assumed to be sta¬ 
tionary. As exp[f] is a periodic signal, its Fourier space is limited 
to discrete frequencies that are a multiple of the assumed periodic¬ 
ity. We may absorb the values of G„, where ojk ~ k ■ — into F. 
In the following we will discard information about G from single 
pulses. Since our main interest lies on the statistics of the envelope, 
we may take this loss. 

Under these assumptions, and neglecting non-diagonal terms aris¬ 
ing when Fourier transforming / (see Sec.lB]and l2.2.3b . the Gui =: 
may be absorbed into an effective mean f as is evident in eq. 

O: 

exp[2/fc] • = exp[2(/fc -f J^)] 


where /j, = log ag^.. We redefine f in i ll lb in that way and yield 


HB[A\g] = ^ 


d 1 
all + exp[2f] 


-F(f-f)tF-i(f-f) + 


+ log(|2^Df-i||2^a^|'^*°*|27rF||d|)] -logP(r) (12) 


where G is now the unity matrix and we assume F ^ to be diagonal 

with components —. We arrive at our model for a single pulse’s 

‘"fk 

amplitudes in Fourier space, assuming all timing parameters to be 
given: 






1 

2 


d 1 

0-2 1 -F exp[2f] 


CTf 


-FlogG(F,an,d)] (13) 


This is the negative logarithm of the likelihood function. Bayes’ 
law gives us the posterior 


P(f|d) 


exp[-Hf ^2[f,d]] 
//Dfexp[-Pf ^ 2 [f,d]] 


(14) 


when we assume a flat prior on f which effects in a scale-free prior 
on the assumed amplitude, since s oc exp[f] is exponentiated. 


2.2.2 The receiver equation’s pdf 

Assuming G to be a diagonal of ones (meaning vanishing corre¬ 
lations) for this argument, the pdf of the receiver equation of each 
bin decouples and for a specific bin and different pulses K takes 
the form of 

P(/|dl oc nexphi^j^IIlJII/Vdl 

which simply states that the variance of s/cTk can be explained as 
sum of the variance of the noise (the one in the denominator of 
the exponentiated term) plus the variance of g ( equals unity per 
definition) times the signal strength of the envelope /. The square 
root term of Jk after the exponential is a normalization factor w.r.t. 
/. We deduce that for a constant signal envelope / and vanishing 
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Fourier transforming a whole epoch 
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Figure 2. Fourier transform of a whole epoch of pulsar B0329+54. The x axis has been scaled to the pulsar’s signal harmonics. Inset: detail of one spike. 


Error estimates on single measurement 



signal in an 


to be most uncertain at the highest signal values while the areas of 
lowest uncertainty can he found where / ~ 1. 

We emphasize that this relative uncertainty will not be integrated 
out by measuring more pulses, as the uncertainty in hoth high and 
low signal areas equally fall with the sqare root of Nk leaving the 
ratio of the remaining error untouched. This fact explains the often 
expressed observation that stronger pulses do exhib it larger tim¬ 
ing errors ^Shannon et alJ[20l^ : lOslowski et al.ll20nh. The reaso n 
was found imp licitly hy numerous authors (e.g. Kulkamil ( Il989h : 
I Ricked il975h ) for the case of self-correlation of the signal. We 
emphasize that even without self-correlation, we expect the higher 
pulses to have a higher intrinsic uncertainty. 

The non-diagonal terms introduced hy G mark the departure from 
simply averaging the data. Let us review the influence of a non¬ 
white stationary process on the data’s spectrum. 


Figure 3. Comparison of absolute and relative error (as uncertainty over 
actual signal strength) for a single measurement of a signal / 

correlations of g 

W)ocexp[-i^^]yi^ 

= (15) 

where Nk is the number of pulses involved. This probability den¬ 
sity function attains it’s maximum for (d) ^ 1 at / = y/ {d) — 1 
justifying the average data subtracting the noise background as the 
maximum likely guess. Carrying out a saddle point analysis, we 
conclude that 

(1 + /=)“ _ /‘■“ t «1 

\ 4 ^! f«/»i 

For a signal that is constant and not the modulation of a stationary 
noise process, we would have expected ctc oc \JcINk as c 1 
such that the relative error decreases with signal strength. In our 
case the relative error decreases too, but only to reach 1 as a limit 
(see Fig.O. Against our intuition we expect the shape of the profile 


2.23 The influence of g on the signal’s spectrum 

When investigating a Fourier transform of a whole observation’s 
time series containing several hundred pulses (see fig. two 
prominent features dictate the spectrum: Spikes of the pulsar’s sig¬ 
nal dominate over receiver background noise. If the signal was 
strictly periodic, we would expect it to be described solely in terms 
of the Fourier coeffieents of harmonics of the base frequency 1/P. 
As the pulsar’s signal contains periodicity, we expect spikes to 
show up in this case totQ However, we are facing a stochastic pro¬ 
cess g that is modulated via a periodic envelope /. As is commonly 
known, multiplication in time domain translates to convolution in 
Fourier domain. As g{at) is expected to be a red noise process, we 
expect it to cast the sharp pulsar signal to finite width in Fourier 
space. Indeed - as the inlay of the figure shows - the signal’s power 
is leaked to the neighbouring bins. The width of the spectral line 
however is rather narrow, pointing to a fast decline of the spectrum 


^ these spikes are not an artifact of the folding period for the archive as 
this folding is discarded in the analysis. Of course the folding period of 
observations is set as close to the true period P as possible to be able to 
integrate the data easily by adding up the receiver bins’ values (in soft- or 
hardware) 
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of g in Fourier space. Consequently, the spectrum of g in time do¬ 
main is spread out widely, pointing to non-vanishing correlations 
over a long time. These correlations diminish the independence of 
both adjacent lying bins and the bins of neighbouring pulses. 

As the multiplication f{t)-g{t) amounts to a convolution in Fourier 
space, evaluating the posterior distribution in Fourier space exactly 
was found to be unfeasible. Instead, we approximated the convo¬ 
lution by taking only the diagonal terms, losing mathematical rig- 
orosity and precision but keeping the problem solvable in reason¬ 
able computational time. The interested reader may refer to Sec.l^ 
of the appendix for a more detailed discussion why solving this in 
time domain would be favorable, but currently is not feasible. 

The clear departure from the average comes from the off-diagonal 
elements of x/dMx/di^ where M is the operator derived. As is ev¬ 
ident from the structure of M, it is expected to be off-diagonal only 
until the correlations introduced by G{t — t') decay. Thus, writing 
this equation as a matrix equation, symmetry tells us that we can 
possibly store only a limited combination of \/d{t — At)Vd [t + 
At) values and average them as they are multiplied by the same 
number when performing the matrix multiplication. The matrix M, 
subject to the exact envelope shape and correlations of the radiation 
process can be calculated afterwards and former data may be re¬ 
processed with the new M in that way without losing the statistical 
information from the single pulse level, even though we may inte¬ 
grate parts of the single pulse data to yield smaller data sets. This 
could possibly be exploited in the future to profit from single pulse 
statistics without having to store all single pulses. 


2.2.4 Maximum-a-posteriori-Filter 

We are interested in finding the f, at values most compatible with 
the data and thus set out to maximize the a-posteriori distributior0 
given the data. For a prior specified by a/,p and fp we derive the in¬ 
tegral over f of (O w.r.t. f and expect the derivative to vanish. This 
procedure yields an implicit formula for maximizing the posterior 
distribution of f. 


I 

0 = 

0 = 


p p ^^max ^max / p £ \ 

// Df E E \fK,,,dKA] 

*7 J u'—n i —A P 


EE/ 


{fK,k - fk). 


K;p[-iTj 2 [fK,k,dK,k] 


which may be solved for every k independently as 


/fc}exp[-ir^ 2 IfK.k^dK.k]] 
fk ~ {fK,k)expl-Hj o [fK,k,dK,k]] 


(16) 


where we organize our data in pulse numbers indexed K and have 
summed over single Fourier coefficients dk of these data of pulse 
K. 

The new variance assumed of erf may be calculated equally by eval¬ 
uating the expectation value of {fk,K — f Cd , since deriving w.r.t. 


® Maximum A Posteriori or MAP refers to taking the most likely value of 
the posterior as best guess 


(Tfc yields; 


0 = / [fK.k,dK,k]] 

-^max ^max p 

0 = E E (^fdiA- 

K=0 fc=0 '' 


,{fK,k-fk)^ 1 


-)exp[- 

^ k 


which may be solved for every k independently as 


^^max 

(((/tfA “ /fe) ~ <^k))exp[-Hj ^2 [/k.IcAk-.IcII 
K=0 fp 

= \l{{fK,k - /fc)^)exp[...] (17) 

Doing this once is assuming a prior of the form of the initial 
f, (Tf. A very general choice may be suitable. Then taking the 
expectancy values of as prior input to d and iterating is 
kn own as the expectation -maximization (EM) algorithm proposed 
bv IPemnster et al.l (Il977h . We may in this way in principle get rid 
of assumptions going into the first prior by iterating since this pro¬ 
cedure converges on a prior fully compliant with the data, assum¬ 
ing gaussian distribution of f. Flowever the EM algorithm may only 
find local maximum which equals a global maximum if the pulsar 
is really radiating with a log-normal probability distribution. The 
optimization problem in both f and erf may be solved using a suit¬ 
able iteration method (see e.g. appendix lO. We assumed here that 
the pulsar is sufficiently described by a log-normal distribution and 
that the dataset converges to the same (global) fix point for all initial 
pairs of values. 


2.3 Detuning model 

The problem of determining ToAs is closely related to a phase shift 
over the analyzed single pulses. As a wrongly set folding period can 
cause phase-drifts over the dataset’s pulses, we will study the influ¬ 
ence of such a detuning on the coefficients of a signal to exclude 
mistaking a phase-drift over the dataset as a phase shift when de¬ 
termining ToAs. Analysing Fourier transformations of finite (con¬ 
secutive) pulses allow us to measure a phase-drift corresponding to 
the detuning of the assumed periodicity r and the real periodicity 
T = -, where a is a correction factor. The equations denoted with a 
letter before the number are derived in appendix |A2] For the signal 
from a single Fourier coefficient with frequency ui' = = 2:^ 

we will get a Fourier coefficient over the period [Kt—^ : Kt + ^] 
of pulse K : 

~ Tl Tl 

dk K ~ Snexp[27rifc(-pa — 1)K] ■ sinc[7rfe( —a — 1)1 (18) 

’ k k 

where s and d with subscripts k, K denotes the complex 
Fourier coefficient of the sample. For n = fc eq GU reads 

dk.K = Skexp[2'Kik{a — 1)K] ■ sinc[7rA:(a — 1)] (19) 

which means that, since a ~ 1, the sinc-factor is approximately 1 
for k j^^^jj.Thus the relative phase of the A'*'' and the {K + 
1)*^ pulse gives access to the model parameter a. 

Let us now derive the Fourier coefficients for the full spectrum of a 
periodic, purely real signal, for such a signal, Sn = Inserting 
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this relation and denoting the phase as arg(s„) =: we get: 

OO 

dk,K = a„sinc[7r(na — fc)]- (20) 

71=1 

■ exp[i(2TvnaK + (l?s„)] + =exp[— i(27rnaii' + 3>s„)] 

'-“-' A 

/ '-' 

L II A 

( 21 ) 

where k — ,d = k — k . This formula can be written as 

dk,K = ^kndn- The main phase information again is to 

be found in term I of lEB for n — k. However this is not the 
only term relevant in analysing a single coefficient Assum¬ 

ing n = k ± m the sine function slowly converges towards zero 
as — and determines the factor for coefficient 1. Coefficient two 

m 

together with the sine function goes like 2 k±m m > 0. Thus in 
the limit of big difference in k and m, both terms equally contribute 
and the phase relation deforms more and more to an elliptic one. 
We emphasize that for small detuning, the sine-term is nearly zero 
for k ^ n. In this case we may calculate only with the diag¬ 
onal matrix and thus correct the data for the phase A<l?(a) = 
exp[i27rnaA']. Thus for a ■ Umax ■ Kmax <K 1 the whole prob¬ 
lem of detuning may be described, with a negligible error by the 
phase shift model in Sec. 12.61 

2.4 Phase model 



Figure 4. The phase en'or introduced to a signal by white noise scales as 
the expectancy value of the arctan of the signal to noise ratio 


Rather than integrating this precisely one can use the following ap¬ 
proximation without significant loss of accuracy 


:=(arctan(exp[-2f]))p(f|d) 

OO 


(26) 

-<E>,^+27rfc)2 

-^—?r:2- Mc 




(27) 


Even for a perfectly tuned signal, we still have to face the noise 
introduced by the receiver and antenna system temperature on the 
phase. Again we will have to assign a measurement model for the 
data phase according to the signal. The noise passes the same linear 
transformation than the true Fourier coefficients and we conclude 
that 


OO 

dk,K ~ ^ ^ -^knl^n “F (22) 

Ti—1 


where n is a white noise process with variance . Since A is de¬ 
pendent on the true signal phases, the inverse problem as a whole 
is quite complex. But if we neglect term II of (HD we may invert 
the remaining matrix A without considering the true signal phases 
and apply it to a given dataset. 

OO 

'^A~ldk,K = l|sn||exp[i'l>s„] -|-n„,x (23) 


fe=i 



Akn '■= sinc[7r(na — fc)]exp[i27rnaA'] (24) 


This is possible since term II destroys phase information in a very 
smooth way and - as will be concluded below - high coefficient 
numbers will have higher phase errors due to the drop in signal 
amplitude and the relative growth of error. Consequently, higher 
coefficients do not give as much information about the phase and 
thus about a anyway. Having preprocessed the data with we 
arrive at the noisy signal and the phase carries information about 
. Thus, given a and having processed the data vector, we may 
now fit for the average signal phase and variance by imposing a 
suitable model. We may estimate the error introduced in the true 
signal’s phase by inspecting eq. ((ID and Fig.El 


tan(A<l>) 


Nl 

||s|| crnexp[2/] 


(25) 


This approximation is valid only for a good signal to noise ratio 
(S/N), where we also expect the most significant evidence to be. 
For low S/N it underestimates the error and we must numerically 
integrate the complete formula to get an error estimate. We em¬ 
phasize that this is the very step where the confidence of the phase 
data and thus the weight in later calculations is calculated using the 
actual amplitude data point measured together with the statistics 
gathered. itself is parametrized by a so-called wrapped Gaus- 
siarQ and a mean of and variance we assume a prior of 
zero average and large variance. 


2.5 Joint probability for phase 


Finally we arrive at the joint probability of the phase model calcu¬ 
lating analogue to the amplitude: 


P(^d,^s|T, cra.„,a) = exp[-i 


($d/(^d)-^s) mod27r||^ 


-F 


(#s — ^s) mod 27r||^ 


]/C 


(28) 


Where T := {f, C7f,a*3, $s} will be the parameters of the full 
statistical template reconstructed and $d' = ^d' (^d) is the phase 
correction according to dSll or other corrections for the assumed 
phase shift. Integrating out we find: 


P{^d\T, crs„,a) = 

l||(#d'(^d)-¥A mod27r||2 

- ^^”2 <+< 


(29) 


^ a wrapped Gaussian is literally a gaussian distribution wrapped around 
the unit circle by identifying all points modulo 27r 
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We now again invert this relation using Bayes’ theorem and 
take a MAP-Ansatz on our model parameters o. 


2.6 Phase shift model for a single epoch as reference 

In order to generate ToAs we will have to determine the relative 
shift in the time of arrival of the pulses compared to another epoch 
taken as a reference. For the sake of simplicity, we assumed that the 
pulsar period does not to change with time (which is only a matter 
of interpreting the relative phase correctly or further imposing the 
detuning model). The pulsar’s signal is assumed to be slipped by a 
time Af compared to the overall period. This amounts to a change 
in the phase of the Fourier coefficients given by: 

rffc = / df d'f exp[—— Af] = ■ exp[ia;fcAf] (30) 

Thus, having reconstructed the template characteristics T = 
{/fci ^k, rr*„} for the first time interval, the subsequent inter¬ 

vals may be analysed by carrying out a parameter study over At. 
This is done by shifting the measured phases —>■ ^d' according 

to 1301 and taking e.g. a MAP-approach over the joint probability 
of the phase iu, now varying At, not a which is assumed to be 1. 
We now are also able to give the approximation for a ~ 1. We de¬ 
rived in this case a phase shift of A<f>(a) = exp[i27rnaiT]. Com¬ 
paring with OOb . the problem of detuning for small a may be ap¬ 
proximated by testing all pulses for a shift of 


At^aK-T (31) 

dependent on the pulsar period number K. All practical cases of 
detuning fall in the category of the aforementioned reduction to 
systematic phase shift. 


2.7 Reference independent difference phase model 

The simplest way to form a reference template is to select a single 
observation. However this relies on the chosen observation being 
statistically representative of the sample as a whole and would dis¬ 
card the extra information available in the whole dataset. 

This in principle could be mitigated by building a total statistical 
template over all epochs observed. We discarded this method for 
two reasons. First of all it assumes that the pulsar template shape 
and statistical behaviour is the same in every single epoch (which is 
not the case for e.g. moding pulsars). Furthermore forcing the tem¬ 
plate to take the same probability density in signal amplitude for 
distinct epochs is too strict an assumption as for example ISM vari¬ 
ations modulate the signal amplitude systematically. Instead, one 
may formulate a probability for the relative shift of two distinct 
epochs, given a certain epoch’s statistical template is correct, and 
then demand that the probability distribution has to agree for all 
templates at once. We depicted this process schematically in Fig. 
12.71 The calculated probability on a relative phase between two 
epochs, given all knowledge gathered, is a clear, basic statement 
coming without the notion of a “reference epoch”. It can be incor¬ 
porated in further analyses without introducing a direct bias caused 
by the phase information of statistical templates used since the ex¬ 
act phase assumed in the template drops out of the calculation. The 
knowledge gathered just filters the data to be compared in a clever 
way. 


Mathematically speaking, we demand that 

= A$W|Ti...r„) = 

= (32) 

3 

The formula for one factor of this product may be deduced from 
l l29t by taking the product of the equation for two epochs’ shifts 
A<Fi, A<F 2 = A<Fi + A<1> and integrating out the difference to the 
used template, A$i. A factor then reads: 

'V, = A$«|r,) = n 

k 

•exp[^^((3.d,fc«}T,- - -fcAcfi)"] (33) 

Where the sum of the new phase vari¬ 

ances estimated from the epochs’ data, given Tj holds. The formula 
deduced simply states that the difference of the filtered mean phases 
of the two epochs under consideration is the mean phase difference 
times the Fourier coefficient number. 

While the gathered probabilities on phase differences are reference 
free, in order to output a TOA, we have to define that the absolute 
phase of a certain epoch amounts to zero. In the classical proce¬ 
dure, the template was assumed to have zero phase. Additionally 
there exist several conventions on defining the physical point on a 
profile where the phase is zero, including but not limited to defin¬ 
ing the mid of the profile at the “center of mass” of the profile or 
at the highest peak. To that extent we declare a “reference epoch” 
which’s To A is declared to be exactly the timestamp of the rising 
edge of the first bin of the pulse. As we assume the epoch with 
the highest S/N-ratio to have the lowest variances in its statistical 
template as seen from the others, we choose this one to be the ref¬ 
erence epoch. While, per definitionem, this TOA’s relative phase is 
exactly zero, the corresponding ToA has to have a variance to be 
fitted into TEMP02. Thus we calculate the variance of the refer¬ 
ence as it would have been determined by all other epoch’s relative 
shifts as 


1 





(34) 


For the other epochs, the probability distribution of the relative 
phase shift provides the correct error estimate on the ToA auto¬ 
matically. 

Another big advantage of parametrizing the ToAs by their differ¬ 
ence is that 132b can be calculated independently for every tem¬ 
plate and added logarithmic. This can be done since the measured 
variable, phase difference, is the same for every pair of epochs, 
no matter which statistical template is assumed for the moment. 
This gives us the freedom to rasterize the probability distribution 
as a whole without preferring one template over the others and stay 
in the Bayesian picture until reducing the probability distribution 
measured into mean phase differences and variances in the very 
last step. This becomes essential when also short epochs with weak 
signal over noise ratio are to be included in the analysis. In this 
case multiple phases are still plausible e.g. mistaking one peak for 
the other (see also fig. [7] and Sec. 13.2b since the yielded probabil¬ 
ity distribution for one template may be too far from the gaussian 
form to be reduced to a mean and variance. The ToA fxoA of epoch 
i with mean phase shift Ad?^®^ and variance cr^®^ w.r.t. the declared 
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Figure 5. Overview of ToA generation: The statistical imprint of the various epochs (A) is used to generate statistical templates (B). The data of the epochs 
(D) is non-linearly filtered using the statistical templates to yield a probability distribution (C) for the shift of every epoch relative to a reference epoch. For 
every epoch, the mean phaseshift in topocentric pulsar periods is then added to epochs MJD to calculate the specific ToA (E). The error is calculated as the 
variance of (C) for this shift. The reference epoch’s error on the ToA is determined as if it would have been timed by every other epoch. 


reference then amounts to: 

fXoA = frisingedge + Po,i X ± (35) 

where Pq,; is the folding period of epoch i and the phase difference 
takes values in the interval [—.5 : .5]. 

Developing an interface to Bay esian extensions of TEMP02, like 
TemDoNest jLentati et alj|^14ll . to communicate the whole proba¬ 
bility distributions on the relative phase information to the pulsar 
timing code would be a further step to enhance the statistics on the 
ephemeris’ parameter sets. While this may be desirable in the fu¬ 
ture we simply reduce the yielded probability distributions to their 
mean and variance and calculated ToAs directly comparable to the 
ones from tools like pat included in the PSRCHIVE toolset and 
processable by TEMP02 without modifications. 


3 APPLICATION TO DATA 

Having derived all relevant equations we may now apply the formu¬ 
lae upon given datasets and test the approach. Starting with simu¬ 
lated datasets we afterwards evaluate a dataset of PSR B1133-1-16 
with 2041 consecutive single pulses consisting of 1024 bins each. 
Furthermore we determined ToAs from data of PSR J1713-1-0747 


and analysed these ToAs with the TEMP02 software package com¬ 
paring with ToAs generated using “pat” from PSRCHIVE and a 
very accurate classical analytical template. 


3.1 Simulated data 

We wrote a simulated data generator serving single pulses of non¬ 
fluctuating shape f but multiplied with gaussian red noise to ac¬ 
count for a highly fluctuating radiation process and analysed the 
convergence of the amplitude model in Fourier space. The clas¬ 
sical average is in this case the statistically optimal procedure to 
determine the pulse profile since the assumption that there is ex¬ 
actly one profile holds, and a quickly converging law of large num¬ 
bers applies. Figl^depicts the reconstruction from ten, hundred and 
a thousand pulses comparing classical template integration with 
Bayesian reconstruction. Both methods perform equally well, as 
was expected. However, the Bayesian reconstruction also gives us 
an estimate on the profile stability and thus a better understanding 
of the remaining statistical uncertainty. Reconstructing a constant 
signal shape is a difficult task for a log-normal Bayesian model 
since the guess of a non-fluctuating shape is statistically only fea¬ 
sible by collecting lots of data. Thus, even though the test seems 
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After ten pulses, the Bayesian method has 
equally well reconstructed the signal as the 
average method, however it provides a han¬ 
dle on the signal fluctuations still possible. 
The seamingly systematic overestimation of 
the signal shape by the expected mean am- 
plitueds comes from the log-normal expo¬ 
nential model used and the non vanishing erf 


Fourier coefficient 
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10 


100 


A hundred pulses already give a profile ac¬ 
curate to over 90% over the essential Fourier 
modes of the signal. Notice how the real er¬ 
ror on the averaged f is limited to the still 
possible statistical fluctuations erf. 


A thousand pulses give a confident template 
up to high coefficients. The Bayesian recon¬ 
struction estimates the average shape to fluc¬ 
tuate below 20% of its mean. The absolute 
error not detected at the very high coeffi¬ 
cients may be explained due to low assumed 
signal and low real signal to noise ratio. The 
dominant error source seems to be a sys¬ 
tematic overestimation due to the log-normal 
signal model. 


Fourier coefficient 


Figure 6. Reconstructions of the original signal shape by both Bayesian and classical methods and their errors after 10, 100 and 1000 pulses respectively. 
The lower half shows the errors of the estimation to the original signal shape and the Bayesian estimation of the remaining uncertainty cjf on the expected 
log-normal mean f 
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simple, it is actually a very strict test for the Bayesian measurement 
model to pass as the correct solution for infinite observation time is 
delta peaked. The algorithm is expected to perform much better on 
real data for which the profile shape measured is much more fluctu¬ 
ating for small integrations since it has a way of incorporating the 
instabilities of the profile. Since in an observation of finite length 
the algorithm will never show zero variance on the profile shape, it 
estimates the profile intensity slightly but systematically too high. 
This gives rise to a problem if one is set out to measure the intensity 
of radiation naively by interpreting the statistical template as a way 
of describing the profile. The template gives a statistical way to de¬ 
termine all compatible intensities or the probability distribution of 
these, but not the “true” intensity directly. 

The high modes exhibit a large absolute error. This is an effect of 
low signal to noise ratio at imperfect numerical integration. For 
low amplitudes, the Hamiltonian for different values of f becomes 
rather flat and thus the outcome just reflects the prior we have been 
taken. In this case the prior was a log-normal Gaussian model with 
2(Tf around the average found. Notice that the err the algorithm 
suggests (leaving aside the singularities of the profile) captures the 
true error (except for singularities and the imperfect numerical inte¬ 
gration of high modes) giving a handle on the remaining statistical 
uncertainty. 


3.2 Determining phase shift 

We used a template with four peaks of Gaussian shape, sizes rang¬ 
ing from 1.8° to 3.6° and generated a dataset with an overall signal 
over noise ratio of — VldB Ri We compared two independently 
generated datasets of one hundred pulses with 1024 bins each, 
shifted by 1.8°, which amounts to a shift of 5.12 bins. As Fig|3 
shows, the Bayesian reconstruction yields AT> = —1.81° ±0.04°. 
The other maxima in the probability distribution are of smaller size 
but reflect the the spacing of the templates peaks and the possibility 
of mistaking one peak for another. Having a local signal to noise ra¬ 
tio of four for every peak and a resolution of 360°/1024 Ri 0.35° 
available, we may estimate the uncertainty of every peaks location 
to be 1.3°Q in every single peak in every single pulse. Having de¬ 
rived the phase drift using 200 x 4 peaks, we classically calculate 
the remaining error due to the receiver noise to J^ 2 ^ 4 oo ~ 0.046°. 
In the fluctuation free case (the simulated emitted signal is exactly 
the four peaked shape before the receiver noise is added and the 
signal is squared) it can be shown that this error is the minimum 
reachable error. The test of our algorithm reaches this accuracy, 
too. 


3.3 Real data 

Generating templates from real data yields reconstructions of sur¬ 
prising confidence. Fig. [8] depicts the amplitude reconstructions 
from certain numbers of consecutive pulses. In each figure, the 
long-term convergence gathered from the whole dataset may serve 
as a reference. We emphasize that for data gathered by a pulsar 
with fluctuating shape, classical and Bayesian reconstruction are in 
general not expected to yield the same curve, since fluctuations are 


® Calculating the exact timing precision achievable is highly nontrivial. We 
estimated this error as follows: Assume a single peak at a S/N of one to be 
locateable to about its width, having a signal to noise ratio of four should 
increase this precision by a factor of \/4. Adding the errors on the bin and 
resolution quadratically, we end up with an average of 1.3° per peak. 



-2 -1.95 -1.9 -1.85 -1.8 -1.75 -1.7 -1.65 -1.6 



- O 




A<I> in degree 


Figure 7. Unnormalized logarithmic probabilities for phase shifts. The 
comb structure comes from the four peaks present in the template. 


handled differently by both models. Both can only be compared to 
their own long-term limit. Furthermore the statistics of the profile 
seem to change with time for the dataset examined. We will analyse 
this short term behaviour in the next section. 

The Bayesian reconstruction does exhibit lower noise at high 
fourier coefficients and converges rather quickly to its ultimate 
form. The main features such as peaks and curvature appear already 
at 64 pulses while the classical average remains dominated by fluc¬ 
tuations occurring. However systematic offsets in the Bayesian re¬ 
construction appear. These may largely be backtracked to the actual 
pulsars intensity changes and nulling periods. A numerical fluc¬ 
tuation of the algorithm is rendered unlikely since each point of 
the construction is calculated independently and thus an equal rise 
in all coefficients is highly improbable. The classical average also 
shifts upwards and downwards with varying intensity, but the effect 
is mostly indistinguishable from noise. For this real life example, 
the method derived is clearly superior to the classical method in ar¬ 
eas of low signal to noise ratio, given that it converges very fast to 
its ultimate form and showing denoised systematic profile changes. 
It is reproducing the classical outcomes in the lower coefficients 
and not picking up the noise of the higher coefficients. After 128 
pulses a confidence of about ten percent is already reached for all 
coefficients present. This paves the way to investigate changes in 
the radiation profiles on these scales. We could further diversify our 
pulses in intensity classes and perhaps in different modes of radia¬ 
tion given the contrast and fastness of the envelope estimate. Fig.|3 
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When looking at the high coeffi¬ 
cients, the reconstruction is already 
quite confident. 



Certain peaks and drops which are 
to be averaged out do not even ap¬ 
pear in the Bayesian reconstruction 



While the classical average is still 
not showing a clear ascend for the 
middle coefficients, the reconstruc¬ 
tion clearly shows the final features. 



The reconstructed curve is system¬ 
atically shifted downwards exhibit¬ 
ing a change in average intensity 
while keeping the form of the curve 
unmodified. 


Figure 8. Template reconstruction for a subset of pulses 


shows the differences of the two reconstruction methods in time do¬ 
main and the single pulse probability density function as described 
by the Bayesian statistical template (depicted as grey shade): While 
the classical average just assumes the pulsar to have a certain pro¬ 
file shape, the Bayesian method is rather based on exclusion and 


describes profile shapes compatible with the data gathered. Thus, 
after as few as ten pulses, the Bayesian reconstruction in time do¬ 
main already resembles the final form of the radiation profile. No¬ 
tice that it still allows for signals out of the profile, since statistics 
do not exclude low signals after ten pulses. 
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Figure 9. Template reconstruction after ten respectively, a thousand pulses. While the classical averaging assumes a certain fluctuating form to be true, the 
Bayesian template and single pulse model shows which single pulses are typically compatible with the statistics of the data gathered 


Moving frame reconstruction 



Figure 10. Depicted are the Bayesian and classic reconstructions from two sets of 50 pulses shifted by a single pulse. The change of the classical template is 
dominated by the spiky fluctuations as is evident from the change in the right peak. 


3.4 Reconstruction stability to fluctuations 

The workings and main differences of the two algorithms become 
evident when analysing the change of the classical profile and sta¬ 
tistical template introduced by small changes of the dataset. This 
is examined by comparing the reconstructions of a window of fifty 
pulses moved along the dataset. An example of stepping one pulse 


further is shown in Fig. Dropping one pulse and inserting an¬ 
other should not change the overall template much. If the tem¬ 
plate changes, we expect it to change smoothly with time. For 
the Bayesian reconstruction, this property can be easily observed 
whilst individual fluctuations dominate the classical average pic¬ 
ture. This can be deduced from the right peak of the template. While 
the algorithm derived gives a rather smooth increase in intensity 
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over the whole peak, the classical average is dominated by spikes 
of individual fluctuation. One could argue that the classical spikes 
might just be smoothed out of the form of the fourier transform. 
This is contradicted by highly localized changes in the Bayesian re¬ 
construction that are statistically significant. The change of the left 
peak may be seen as an example for a localized statistical change. 
The classical average does not change much there, but the Bayesian 
reconstruction shows a significant rise of the left flank of the peak. 
The algorithm is found to behave in such a way over the whole 
dataset. Furthermore it incorporates the fluctuation seen in real en¬ 
velopes. This may increase accuracy of derived parameters like the 
ToAs since e.g. the confidence of a peak appearing at a certain point 
should be, but is not, taken into consideration by classical timing 
procedures. 


3.5 Nulling 


Given a set of proposed models, like different statistical templates, 
Bayesian statistics can also assign a probability on how likely every I 
single model is to describe the data, given that one of the proposed u 
models is true. An example for the ability to discriminate between 
different models is nulling detection. Two sets of parameters are 
analysed within the framework: One set of f, cfis generated from 
a training set of nulling periods taken from the first 50 pulses, the 
other one is taken from the overall reconstruction as above. The 
likelihood of a model i described by a set of parameters rrii is com¬ 
pared as 


V{mi\d) 


ffDsP(mi,s,d) 

Ei//DsP(mi,s,d) 


(36) 


which may be derived straight-forwardly from Bayes’ theorem. De¬ 
ciding between the white noise model and the model derived from 
all thousand pulses, this formula gives the probabilities given in 
Fig.im for a certain frame to be a nulling pulsar period. These 
probabilities could be used as weights when determining further 
parameters without biasing the statistics by deciding whether there 
was a pulse or nulling. Furthermore when looking at the plot, the 
Bayesian probabilities resemble our state of uncertainty for pulses 
which are in the 50% region when a by eye decision should be 
made. Fortunately, the method is able to quantify this uncertainty 
for us automatically. Nulling information could in principle further 
improve the accuracy of detuning and time of arrival analysis. 


3.6 Determining Times Of Arrival 

3.6.1 ToAs from simulated data 

Timing simulated data sets is a precise way to test the algorithm 
for theoretical performance since the true signal to be found out of 
noisy data are known. We implemented the phase shift model (as 
described in Sec. 12.6t . We generated a statistical template out of 
10,000 simulated pulses and then measured the accuracy on timing 
= 10, • • • , 50, 000 pulses with that fixed reference, where we 
used a MAP approach. Statistics was gathered over 20 randomly 
shifted datasets for each value of N recording the absolute error 
on the ToA and deriving the mean error and standard deviation to 
quantify the accuracy of the algorithm. The algorithm performed 
as depicted in Fig.[T2 As expected, the systematic phase error was 
well bounded by the fluctuations given by the test datasets. The de¬ 
viation from the true value followed a 1/ s/N law for large N as 
expected and reached the accuracy that is information theoretical 
possible. When there were more than 10,000 pulses in the dataset 


Time domain probability for nulling 



Figure 11. Discrimination-less decision: Bayesian analysis gives a proba¬ 
bility for a certain pulse candidate to be explained by white noise only. In 
subsequent data-processing, this probability may be used to weight signals 
that are only present in non-nulling phase instead of a threshold algorithm 
producing false positives or negatives. 


to time, a systematic error appeared. This was expected, since the 
reference statistical template itself (generated from 10,000 pulses) 
does not contain more accurate phase information. Since we were 
measuring an absolute, not a relative shift here, the error contained 
in the template becomes systematic. The sudden drop of the sys¬ 
tematic error between 10 and hundred pulses is a random sampling 
artifact of the simulation involved. The average over the discrete 
maximum values taken was in this case accidentally zero. We also 
evaluate the reached accuracy for different S/N-ratios in Fig.[T2 It 
is interesting to notice that the algorithm returns a rather flat proba¬ 
bility distribution rendering a lot of phase values likely. This can be 
deduced from the high errors encountered leading to a plateau for 
low values of N. At a certain point the probability distribution be¬ 
gins to peak and accuracy increases with a jump, locking onto the 
signal. This may be explained by passing the point where different 
peaks of the profile shape can be distinguished statistically. This is 
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Error on ToAs measured by a template from 10.000 pulses 



Number of pulses 

Figure 12. Test of accuracy over epoch size of the phase shift model as 
described in sec. m a variable number of pulses is used to generate ToAs 
and the algorithm is examined for statistical (red) and systematical (green) 
errors. 


Accuracy of ToAs for different SNRs 



Figure 13. Test of accuracy for different S/N-ratios of the phase shift model 
as described in sec l2.6l the grey dashed lines all follow a 1/ \/lV slope. 


not a drawback of the method hut reflects the fact that, also in the 
ideal case, the error follows the 1 / v5v slope only for large N and a 
unique maximum. After that point the performance follows a stable 
l/v5V curve. 

We also simulated the whole process from measuring epochs to 
determine the ToAs therein. To this end we simulated 20 epochs 
with 100 pulses each. Then we generated statistical templates from 
each epoch and used that to time the relative shifts of every other 
epoch. Then we naively averaged the maximum a-posteriori val¬ 
ues of the shifts and their variances using classical formulae for 


sigma measured from posterior probability distribution 



Figure 14. The upper panel depicts the remaining uncertainty (in an arbi¬ 
trary scale) when the row epoch is measured by the column epoch. Notice 
that there are some combinations of epochs which lack statistical similari¬ 
ties and give greater uncertainties. The lower plot shows the remaining eiTor 
on the ToAs of the epoch when combining the measurements from above. 


measurement value addition to find the ToAs and compare them 
with their true values. The dataset contained 2000 single pulses. 
The results of this simulated measurement is displayed in Fig. 1141 
We reach an accuracy of 1.9 x 10“® radians, which is nearly the 
theoretical maximum for 100 pulses per epoch as can he seen from 
Figim When timing with only 10 epochs, the accuracy dropped 
to about 3 X 10“^ radians. In that case, the theoretically possible 
accuracy for each epoch was missed. That practically means, that 
gathering knowledge of later epochs can improve measurements of 
the past hy reprocessing them with the newer statistical templates. 


3.6.2 ToAs from real data 

The method carried out in such a way fails when tested with real 
data. Reducing the phase shift information of every statistical 
template to classical mean shifts and their variances before cor¬ 
relating the different templates’ measurement flaws the precision 
of the method. This can he understood easily considering very 
noisy templates. These may cause different very likely shift 
values separated hy a larger interval of unlikely phase values, e.g. 
different peaks of the profile may he mistaken for each other. 
When reducing, the mean and variance of only the most likely 
peak was determined giving a rather low sigma value on it, since in 
its neighbourhood, the prohahility density can he approximated as 
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Table 1. Comparison of timing residuals of observations taken at 1350MHz and model accuracy as reported by TEMP02 for classical and Bayesian ToAs. 
The smallest values in each category have been highlighted. 


Classical timing Bayesian timing 

TEMP02 fitting mode rmsAVrms red. rmsAVrms 


RMS 

551ns 

_ 

_ 

431ns 

_ 

_ 

weighted RMS 

240ns 

455 

23 

163ns 

35 

2 


a gaussian. This sets a too large weight on the statistical template 
compared with the other templates if the algorithm accidentally 
locked on the wrong peak. The consequence is a large bias in the 
direction of the mistaken peak. However, if one does not reduce the 
data template by template, but examines the probability density on 
the relative phase shift of two epochs using all statistical templates 
at once, one circumvents the possibility of this local fallacy and 
gets an overall correct probability distribution. 

An algorithm obeying these caveats has been outlined in Sec. 12.71 
By measuring the same quantity (relative shift of two epochs) 
with different statistical templates we may rasterize the probability 
distribution for that quantity considering all references at once and 
in parallel. Since pulsar timing expects us to state a single ToA 
per epoch and the expected error on it, the MAP ansatz taken in 
the very last step is now justified. The reported errors on the ToAs 
are found to be in agreement with the RMS reported by TEMP02 
yielding low values. 

Using 10 second observations of PSR J1713+0747 at 1350MHz 
with a total of about 19h observation time in 36 epochs observed 
in Effelsberg over about 1.9 years we tested the algorithm. Even 
though single pulse observations carry much more statistical in¬ 
formation that the data at hand, the statistical templates generated 
show variations. The ToAs generated reach an RMS of 431ns and 
a weighted RMS of 163ns at a fit of 35. The classical method 
yields 551ns unweighted or 240ns weighted RMS with a of 
455. We collected the results in Table[T] 


3.7 Moding 

We examined the algorithm’s timing behaviour on a moding pulsar. 
We used a dataset of pulsar B0329+54 at 21 cm consisting of 5000 
single pulse observations with 1024 bins each. Using the same for¬ 
mula ( I36t as for the nulling anlysis, the alorithm decides if a single 
pulse or a subset of pulses was emitted in a certain pulsar mode 
described by statistical template mi (consisting of the same set of 
parameters like Ti ). We auto-generated a moding analysis by gen¬ 
erating a statistical template T over all data. Then we analysed the 
probability of each given single pulse to appear, given the template 
T describes the radiation correctly. Basically this means evaluating 
the joint probability for the phase l l28b and amplitude (by integrat¬ 
ing i ll lb over s). We ordered this list of probabilities descending 
and initially divided it into rimodea sections of equal length, where 
we assumed a certain n„iodes to be the correct number. 

In principle, the number of statistically distinct modes is also sub¬ 
ject to uncertainty. One may also calculate the probability, that a 
certain rimodea holds. This could in principle further improve the 
results derived in the following but is of secondary interest for a 
first analysis of the behaviour of the algorithm. 

Given this initial assignment of the single pulses to modes, we 
started an iteration procedure. In each step, a new set of statistical 
templates according to the weights of the previous step are calcu- 



Figure 15. We overlayed the average profile curves of different modes over 
a scatter plot of the single pulses coloured according to the probability of 
belonging to mode 1-3. The dotted shape amounts to an average over all 
pulses. 


lated: }. These were then used as models in eq. 

l l36b to assign the probability for each individual pulse to belong 
to a certain mode. These probabilities become the new weights for 
that pulse. We experimented with this iterating procedure and found 
about five steps to be sufficient that the difference from step to step 
is negligible. 

The algorithm allows an unambiguous assignment of single pulses 
to a certain statistical template except for a very small number of 
pulses. The statistical nature of single pulses makes it difficult to 
grasp what classification is happening. Thus we decided to make a 
scatter plot (Fig. [B} of 5000 single pulses assigned to three modes. 
The single pulses were blurred to simulate a density kernel and 
plotted with an opacity of 1% in the colour of the probability to 
belong to a certain mode. For orientation, we overlayed the aver¬ 
age pulse shapes each mode would generate if it was the sole mode 
observed along with an average over all pulses, drawn in dotted 
white. 

The term moding usually refers to the appearance of few distin¬ 
guishable shapes of the integrated pulse profile. As a single pulse 
profile is very different from its neighbouring pulses, it was very 
common to assign different modes to consecutive pulses. Thus the 
question arises, whether to still call this behaviour moding or not. 
On one hand, we could try to further develop the algorithm to make 
mode switching on such small timescales very unlikely. Then the 
algorithm would pick up “modes” in a more classical sense. On the 
other hand, the integrated profiles of these “sub”-modes are quite 
distinct and certainly their ratio has an impact on the average profile 
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ToA phase in radians 


Figure 16. This figure compares the phase detection performance of a single or a moding statistical template on 25 respectively 50 pulse integrations. 


shape over a few minutes integrating and thus also on timing (as we 
will examine below). Understanding the relationship between these 
sub-modes and the astonishingly stable moding behaviour over a 
larger integration time could lead to a deeper understanding of the 
conditions in the pulsar magnetosphere. 

These submodes also seem to be correlated over the whole profile. 
For example, mode three in the figure is the only one having an 
earlier rise of intensity at 230 degrees and additionally a very low 
slope in the middle of the profile. 

Taking these different correlations into account can also archive a 
more accurate timing and make the timing stable against different 
integration times and noise. Fig. M shows two examples of how 
a moding template on single pulse level can improve detecting the 
phaseshift of a few tens to hundreds of pulses. Statistical templates 
were generated from 5000 pulses and later used to time simulated 
epochs consisting of consecutive subsets of 25 in the upper and 50 
single pulses in the lower panel, taken from the same 5000 pulses. 
The most likely value for the shift of each set was binned to gener¬ 
ate histograms of the observed frequency. Knowing that there was 
no subpulse drift over the 5000 pulses as a whole, we expect the 
phase to be measured as zero for every subset. This is indeed the 
case, if we assume the pulsar to radiate in different modes (depicted 
in green). When run over subsets of 25 pulses, the non-moding tem¬ 
plate (depicted in red) fails to detect the correct shift by repeatedly 
mistaking one peak for the other. Unfortunately the few outliers 
which would have identified the right peak are scattered around 
zero with a systematic bias. We suspect this bias to be caused by 
the single statistical template trying to cover two or more very dis¬ 
tinct modes of radiation. Doubling the number of pulses in a subset 
fixes the problem of not matching the right peaks. However, there 
is still a bias and the results have a larger variance than the moding 
template. The moding template detects the correct pulse phase with 
satisfying accuracy while showing a lower variance. This can be 
understood having a look at the generated modes. The maximum 
value of the second peak for every single mode is further left for 
pulses which are less intense. In the average picture over all pulses 
this correlation cannot be accounted for. When looking at subsets 


of pulses however, the lower probability of reaching a high inten¬ 
sity mode shifts the peak to the left. Consequently, the reference 
template is detected to be shifted to the right. The ansatz using one 
statistical template reduces this inaccuracy to about half a degree, 
which is still a low value in the light that e.g. the peaks of the first 
and second mode, if we assume three modes (see Fig |15b . are sep¬ 
arated by 1.5 degrees and the average taken from 50 pulses is still 
very noisy. Using a statistical moding template, the variance in vari¬ 
ability exceeds the inaccuracy introduced by integrating a smaller 
subset than the one the template was generated from. 

We conclude that assuming even a low number of modes to be 
present can significantly improve the timing results both in vari¬ 
ance and systematic error. 


4 CONCLUSIONS 

We developed and evaluated a log-normal Bayesian model for sin¬ 
gle pulse analysis of pulsars. The algorithm described is able to 
reproduce the results of classical averaging procedures when con¬ 
sidering non-fluctuating templates up to the approximations taken. 
The method is surprisingly versatile in situations of weak signal 
or a fluctuating template shape, since it is shown to be more ro¬ 
bust against fluctuations. Thus it may enable one to examine fast 
changes in the radiation profile or distinguish different modes of 
radiation, such as the nulling analysis, an example for model dis¬ 
crimination, has shown. 

Secondary parameters’ reconstruction may be implemented easily 
by calculating their imprint on the data and then evaluating the 
joint probability using the model derived, as was demonstrated with 
phase drift on simulated epochs. The parameter studies derived in 
such a way profit from additional statistical information incorpo¬ 
rated in the template automatically since only the statistically sig¬ 
nificant part of the data will have an imprint on the probability dis¬ 
tribution of the parameters under examination. 

The benefit of rather understanding which parameter values are 
compatible with the data gathered than trying to reconstruct one 
possible signal out of the dataset might be worth the effort im- 
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plementing a Bayesian reconstruction for the parameter of your 
choice. This may give statistical access to hitherto poorly exam¬ 
ined parameters and topics about the radiation process and mag¬ 
netosphere by providing strong and reasonable reconstructions for 
noisy and fluctuating data where a classical estimator leaves us with 
a definite but perhaps insignificant answer. 

A first test to generate ToAs from real data showed that the al¬ 
gorithm developed can reduce values while reaching the accu¬ 
racy of classical methods even though we did not use single pulses 
but few second integrations and approximated some calculations. It 
gave correct ToAs even though the statistical templates used were 
generated from the same data that was to time. The classical way 
of template generation is known to affect t he results of timing if the 
same data is used for template generation dHotan et al]l2005 ). 

If one intends to use the method on routine basis as an alternative 
to the classical one, the data volume to keep for logging and gen¬ 
erating reproducible output increases drastically. One would have 
to find a way of reducing the data without loosing the possibility to 
reprocess it or re-validate it. 

Data compression and perfor mance might be impro ved at the same 
time for Bayesian methods, as Ivan HaasterenI ( 120131) already points 
out. Computational feasibility sets a limit to the potential use of 
Bayesian analysis, but there exist read y to use methods for speed¬ 
ing up Bayesian calc ulations, such as Tayloret ^ j2012|) or the 


Metropolis-algorithm dMetropolis et al. 


19531) . The formulae pre¬ 


sented may also be sped u p by analysing them in perturbative 
manner. lEnBlin et al.l d2009l) and references therein develop an in¬ 
formation field theory, taking the information Hamiltonian H = 
— log P(- • •) as a starting point. This allows for approximating the 
numerical integrations involved analytically and breaking their in¬ 
fluence down to computationally simpler matrix arithmetic. 
Furthermore consolidating different data channels, like radio data 
at different frequencies, is easily possible in a Bayesian framework. 
Once the computational complexity is dealt with, the Bayesian 
method might provide a way to reduce observation time for low sig¬ 
nal to noise pulsars and give a handle on the statistical uncertainties 
involved in template generation. It gives access to a broad range of 
secondary applications such as sophisticated methods of interfer¬ 
ence detection or testing pulsars long-term behaviour for template 
changes or short- and long-term moding behaviour affecting ToA- 
analyses. 

Hence the basic analysis carried out has shown that Bayesian re¬ 
construction of pulsar templates is not only feasible, but also forms 
a more complete picture on the pulsars template shape. The addi¬ 
tional data turns out to be very valuable for subsequent analysis 
of secondary parameters mitigating the effects of insignificant fluc¬ 
tuations. Benefits known to arise by inspection of the stochastic 
behaviour in the analysis will be automatically propagated by im¬ 
plementing a measurement using the proposed Bayesian template. 
Having these manifold possibilities in mind we suspect the method 
to bear fruit worth picking and suggest further examination. 


mathematical difficulties associated with the evaluation of the pos¬ 
terior pdf. 
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APPENDIX A: CALCULATIONS 
AI Integration over g 

Starting from eq. Gic 


Hb[A] = - 


(v/d — exp[f] • g(Jn)^-!^(V^—exp[f] ■ gcTn) + f^F ^f + g^G ^g+ 


-logP(r) + -log(| 27 rcrn|^*°*| 27 rF|| 27 rG||d|) 


Vd^\Vd-2 

(Jn CTn 


(ydexp[f])'l‘^ ^ 


g + g^exp[f]'exp[f]g + g''‘G ^g 


t/d^-!j%/d +g^ ["g ^ + exp[f]^exp[f]^ g — 

rTn \ / rTn 


= Df 


+ A = 


= 1 ^'iVd + (g - ...)'D,(g - 

Z (Tn 

Integrating over g leads to: 


+ A 




Vd' 


— exp[f]^Df ^exp[f]^ + log |27rDf ^ 


+ A 


+ A = 


Vd^^ yd + log|27rD7i| 

Vd^^ v/d + f^F-^f+ log(|27rD7^||27rcr^|'^‘°‘|27rF||27rG||d|) 


- logP(T) 


(Al) 


where Df = exp[—f]G ^exp[—f] + 1 is always greater than 1 leading to a broadening of the likelihood in case of bad signal noise ratio. 
We may reformulate the matrix in the data-dependent term as 

1 — = [D7^][D/ — 1] = [exp[—f]G~^exp[—f] + l]“^exp[—f]G“^exp[—f] = [1 + exp[f]Gexp[f]]“^ (A2) 

which leads us to the final form 


Hb[A] = 


v/d^-!j[l + exp[f]Gexp[f]] ^x/d + f^F + log(|27rDf |27rCTn|^‘°*|27rF||27rG||d|) 


— logP(T) ( fTTl i 


A2 Discrete Fourier coefficients of a detuned signal 

For the signal consisting of a single Fourier coefficient with frequency u’ — = 2:^ tfie measured coefficient over a period [Kt — ^ : 

Kt + J] of pulse K is readily calculated as 


dk,K — 


fKr+i 

/ df s„exp[z(cu^ - ujk)t] = 

J Kt-?; 


xp[27ri 


(na — k) 


t] 


1 Kt+? 


2Tii{na — k) 
exp[27rifc 


JKr-X 
Kt+? 


-t] 


2nki{‘^a — 1) 


TL 

= s„exp[2.ffe(^«-l)^l 

= s„exp[27rifc(^a — l)K] ■ sinc[7rfc(^o — 1)] 
k k 


Kt-^ 

sin -Kki^a — 1) 


Gl 
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For the full spectrum of a real signal, where Sn = we get 

fKr+i ^ 

dk,K = dt y Snexp[i(a;^ - 0Jk)t] = 

JKT-^r __ 


E 


n= —oo 

exp[27ri 


n= —OO 


27Ti(na — k) 


Kt+^ 


= E 

n=l 

oo 

= E“' 

n=l 

oo 

= E“' 

n=l 

oo 

= ±E 

n=l 

oo 

-±E 

n=l 

oo 

= E“’ 


exp[27rii2^^f] ,exp[27rt ^-"°-'°^ f] 
Sn „ -hs. 


Xr+5 


2ni{na — k) 2'Ki{—na — k) 

n.-, --j 

(fc + na) (cos il?s„ + f sin <l?s„ )exp[27ri + (fc — na) (cos <l?s„ — i sin )exp[27rt t] 


27ri(fc2 _ (naP) 

, -k , / -ikcos(2^t) + nasm(i^t) ik sm(+ nacosi^^t) 

“Pl“-‘l (“*-- \le-(n.V) ’ - („■)») ' 


iCT-f 

Xr+5 


2 sin(7rna) 
2 sin(7rna) 


cos 4>s„ (ik smip.imaK) + na cos(27rna_K')) + sin $s„ {—na sm{2iTnaK) + ik cos(2TrnaK)) 


7r(fe2 — (naP) 

i sm{2nnaK){cos il?s„ k + i sin il?s„ na) + cos{2'KnaK) (cos >l>s„ na + i sin <E>s,, k) 


2sinc['7r(na — fc)] 


7r(fc2 — (na)^) 

i sin(27rnaJf)(cos ^ 3 „k + i sin <3>s„na) + cos(27rna7F)(cos 'l‘s„na + i sin i&s,, k) 


k + na 


where we now introduce k = ^ d = k — k and simplify 

OO 

= a„ [sinc[7r(na — k)]- 

n=l 

i sm(2nnaK){cos (k — d) + i sin Fsn {k + d)) + cos(27rnaK)(cos <Fs„ (k + d) + i sin Fsn (k — d)) 


(A3) 


k 


= a„ [sinc[7r(na — fc)]- 


k{i sm(2nnaK) + cos(27rna7F))(cos 4>s„ + i sin <l>s„) + d{—i sin(27rnaiF) + cos{2'KnaK)){cos "Fa,, — i sin Fs„ ) 


= ansinc[7r(na — A:)] 


k 


exp[i{2TmaK + 'Fa„)] + =exp[—i(27rnaA' + "Fs^)] 
'-„-' k 


■ ^ ^ ^kn^n 


(EB 


where we assumed 6o to be zero and used Sn =’■ an{i sin + cos $s„), a„real. 


APPENDIX B: SOME CONSIDERATIONS ON THE EVALUATION OF THE RECEIVER POSTERIOR PDF 

Having rigorously marginalized out the stationary process g, we seek to maximize the posterior probability density function with respect to 
the template parameters. When projecting this equation into Fourier space, G is a diagonal matrix and the multiplication with / transforms 
to a convolution that breaks this diagonality, leading to a coupling of the Fourier coefficients. While in the diagonal case the pdf completely 
factorizes in separate Fourier coefficients, which are independently and quickly integrated, there is only one -dimensional integral in the 
exact case. For typical NunS of 1024 the computational demand is simply too high. Thus we decided for imposing diagonality in Fourier 
space, which discards the non-diagonal terms of f leading to a solvable integral. However, this discards phase information as now / has also 
the properties of a stationary process. Consequently we have derived an amplitude-only model. An Amplitude model alone does not contain 
valuable information about TOAs since the amplitude is invariant to time shifts. Therefore, we impose a model on the phase of the signal 
based on wrapped gaussians and being evaluated phenomenologically. This model uses the amplitude information to estimate the phase error 
of an observation at hand. The phenomenological way was chosen as the exact calculations for quadrature of the stochastic signal mix the 
phases of different Fourier channels in a highly non-trivial fashion again leading to an unwanted coupling of phases present only at the single 
pulse level. 

This ansatz tries to find a compromise between numerical feasibility and accuracy. For the case of real measurements containing additional 
noise and fluctuations, we expected to grasp the uncertainties left with acceptable precision and lead to acceptable -values at the end. 
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However losing the mathematical precision, we expect not to reach the accuracy of the classical timing codes when it comes to evaluating 
exact, nonfluctuating test data sets. Indeed our round mean square errors turned out to be 25% worse in these cases while the reached was 
comparable accurate. 

The question surely arises why we did not evaluate the pdf in time domain. In time domain, the main operator acting on the data is the inverse 
of a diagonal matrix (/) plus a Toeplitz-matrix G 0 that can be further reduced to a circula r Toeplitz-Matrix by the symmetry in /. There 
exist fast ways (up to 0{N log N)) of calculating the operators scalar product with the data iNg & Panll2010h however this does not reduce 
the dimensionality of the integration when it comes to varying the template parameters of interest. Unfortunately, expanding the operator 
linearly (to develop a mean field theory for the signal field, which should lead to a very good estimate and first order corrections for the 
correlations present in the problem) does not help in this case, as the derivatives refer to specific entries of the inverse matrix. Calculating the 
inverse matrix itself scales 0{N^) or best 0{N^) in our case, and as we need N entries for the calculation of the mean field gradient, the 
computational demand is again unacceptable. A class of transformations on f that leave the determinant of the operator invariant, combined 
with a pdf on the invariants of the transformation, could make exact evaluation of the derived equations possible. In lack of such a tool we 
decided to approximate the problem in Fourier space. 


APPENDIX C: ITERATIVE SOLVER FOR FIELDS 


Finding the maximum a-posteriori (MAP) probability set equals finding a global maximum for f and Cf both at the same time. Equations 
GU and GUI explicate optimization w.r.t. one parameter only in the context of correct prior knowledge. Furthermore, we want to find the 
optimal solution w.r.t. both parameters. We may resolve these issues by assuming that there is only one maximum (which might not be the 
case as there might be several modes of radiation in the system etc.) and rely on a suitable iteration scheme. We implemented a method of 
steepest ascend. The Hamiltonian may be seen as a potential to be broken up in parts dependent on and at for every Fourier coefficient. 
For the sake of shortness let us call this part <I?(f (f), cr(f)). We now try to find the path of the steepest ascend, parametrized by t: 74 = (f, a). 
Infinitesimally, isolines may be found using dil?( 7 t) = 0. Working this out for our Hamiltonian leads to 


0 = ( 


rr^ 


1 .da f — f dt 
a dt cr^ dt 


(Cl) 


Consequently, the direction of the steepest ascend/descent follows the differential equation 




(C2) 


where orthogonality is easily shown. Since for most cases occurring setting f according to GSJ is an acceptable method while a reasonable 
update procedure of a prior to optimizing it was missing, we update a when updating f by the method, but not vice versa. Integrating i lC2b 
is approximated by 


af V / (f - f) 

{{f + {{f -foldf) 

new — ~ 

old 


(C3) 

(C4) 


and determining f^e™ as in jl 6 t . As is evident by derivation, this method is generic for a prior of the form of a Gaussian, however could be 
refined by also updating the f value by similar considerations. 


a matrix whose entries depend only on the distance to the diagonal 
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